AIbase

# Efficient quantized inference

Mistral Small 3.1 24B Instruct 2503 Q4_K_M GGUF
Apache-2.0
This is a GGUF format model converted from mistralai/Mistral-Small-3.1-24B-Instruct-2503, supporting multilingual text generation tasks.
Large Language Model, Supports Multiple Languages
PatataAliena
124
1
LGAI EXAONE EXAONE Deep 2.4B GGUF
Other
This is a version of LGAI-EXAONE's EXAONE-Deep-2.4B model quantized with llama.cpp, supporting English and Korean text generation tasks.
Large Language Model, Supports Multiple Languages
bartowski
304
1
T5 3B Q4_K_M GGUF
Apache-2.0
This model is a quantized GGUF conversion of google-t5/t5-3b, produced with llama.cpp via ggml.ai's GGUF-my-repo space.
Machine Translation, Supports Multiple Languages
VVS2024
15
0
Finance LLM GGUF
Other
Finance LLM is a language model specialized in the financial domain. It is based on the Llama architecture and fine-tuned with datasets such as OpenOrca, LIMA, and WizardLM.
Large Language Model, English
TheBloke
641
21
FLAN-T5 XXL Sharded FP16
Apache-2.0
FLAN-T5 XXL is a variant of Google's T5 model fine-tuned on over 1,000 additional tasks. It supports multiple languages and outperforms the original T5 model.
Large Language Model, Transformers
philschmid
531
54
© 2025 AIbase